Gaussian Mixture Models For Extraction Of Melodic Lines From Audio Recordings
Abstract
This study addresses the extraction of melodic lines from polyphonic audio recordings. Our work is based on the expectation-maximization (EM) algorithm, which is employed in a two-step procedure that finds melodic lines in audio signals. In the first step, EM is used to find regions of the signal with strong and stable pitch (melodic fragments). In the second step, these fragments are grouped into clusters according to their properties (pitch, loudness, etc.). The resulting clusters represent distinct melodic lines. Gaussian mixture models trained with EM are used for the clustering. The paper presents the entire process in more detail and gives some initial results.
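To make the second step more concrete, the following is a minimal sketch of clustering melodic fragments with a Gaussian mixture model fitted by EM. It is not the authors' implementation. The feature choice (mean pitch as a MIDI note number, loudness in dB), the diagonal covariances, the fixed number of components, the quantile-based initialization, and the synthetic fragments are all assumptions made for illustration.

```python
import math
import random


def log_density(point, mean, var):
    """Log density of a diagonal-covariance Gaussian at `point`."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
               for x, m, v in zip(point, mean, var))


def responsibilities(point, weights, means, variances):
    """E-step for one point: posterior probability of each component."""
    logs = [math.log(w) + log_density(point, m, v)
            for w, m, v in zip(weights, means, variances)]
    top = max(logs)  # subtract the maximum for numerical stability
    exps = [math.exp(value - top) for value in logs]
    total = sum(exps)
    return [e / total for e in exps]


def fit_gmm(points, k, iters=50):
    """Fit a k-component diagonal GMM with EM (illustrative helper)."""
    n, dim = len(points), len(points[0])
    # Assumed initialization: means taken at evenly spaced positions
    # of the points sorted by their first feature (pitch).
    ordered = sorted(points)
    means = [list(ordered[(2 * j + 1) * n // (2 * k)]) for j in range(k)]
    centre = [sum(p[d] for p in points) / n for d in range(dim)]
    spread = [sum((p[d] - centre[d]) ** 2 for p in points) / n + 1e-6
              for d in range(dim)]
    variances = [list(spread) for _ in range(k)]
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: soft assignment of every fragment to every component.
        resp = [responsibilities(p, weights, means, variances) for p in points]
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            weights[j] = nj / n
            for d in range(dim):
                means[j][d] = sum(r[j] * p[d] for r, p in zip(resp, points)) / nj
                variances[j][d] = sum(
                    r[j] * (p[d] - means[j][d]) ** 2
                    for r, p in zip(resp, points)) / nj + 1e-6
    return weights, means, variances


def assign(point, weights, means, variances):
    """Index of the most probable component for one fragment."""
    resp = responsibilities(point, weights, means, variances)
    return max(range(len(resp)), key=resp.__getitem__)


# Synthetic fragments (assumed data): (mean pitch, loudness in dB).
rng = random.Random(1)
bass = [(rng.gauss(45, 2), rng.gauss(-20, 2)) for _ in range(30)]
lead = [(rng.gauss(72, 2), rng.gauss(-10, 2)) for _ in range(30)]
fragments = bass + lead

weights, means, variances = fit_gmm(fragments, k=2)
labels = [assign(p, weights, means, variances) for p in fragments]
```

Each entry of `labels` is the index of the mixture component, and hence of the candidate melodic line, to which a fragment is assigned. Real recordings would likely need more features and some way of choosing the number of components, neither of which the abstract specifies.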
Similar resources
On Finding Melodic Lines in Audio Recordings
The paper presents our approach to the problem of finding melodic line(s) in polyphonic audio recordings. The approach is composed of two different stages, partially rooted in psychoacoustic theories of music perception: the first stage is dedicated to finding regions with strong and stable pitch (melodic fragments), while in the second stage, these fragments are grouped according to their prop...
Optimizing Melodic Extraction Algorithm for Jazz Guitar Recordings Using Genetic Algorithms
Extraction of the main melody of a musical piece is a preliminary step in the process of transcribing the piece. Automatic melodic extraction is the task of computationally extracting what a human listener would perceive as the main melody of a polyphonic recording. Several melodic extraction systems have been proposed. However, such systems normally require a number of parameters to be manuall...
Speaker Clustering With Neural Networks And Audio Processing
Speaker clustering is the task of differentiating speakers in a recording. In a way, the aim is to answer "who spoke when" in audio recordings. A common method used in industry is feature extraction directly from the recording thanks to MFCC features, and by using well-known techniques such as Gaussian Mixture Models (GMM) and Hidden Markov Models (HMM). In this paper, we studied neural network...
Melodic Pattern Extraction in Large Collections of Music Recordings Using Time Series Mining Techniques
We demonstrate a data-driven unsupervised approach for the discovery of melodic patterns in large collections of Indian art music recordings. The approach first works on single recordings and subsequently searches in the entire music collection. Melodic similarity is based on dynamic time warping. The task being computationally intensive, lower bounding and early abandoning techniques are appli...
Multimedia fusion in automatic extraction of studio speech segments for spoken document retrieval
This paper describes our progress in Cantonese spoken document retrieval. Over 60 hours of Cantonese television news broadcasts have been collected as part of AoE-IT Multimedia Repository. We have also developed the Multimedia Markup Language (MmML) for annotating the multimedia content in terms of anchor/field video frames and audio recordings. The audio tracks are indexed by a Cantonese sylla...
Publication date: 2004